Superclustering by finding statistically significant separable groups of optimal gaussian clusters
The paper presents an algorithm that clusters a dataset by grouping Gaussian
clusters, whose number is optimal with respect to the BIC criterion, into
superclusters that are optimal with respect to their statistical
separability.
The algorithm consists of three stages: representing the dataset as a
mixture of Gaussian distributions (clusters), whose number is chosen by
minimizing the BIC criterion; estimating the distances between clusters and
the cluster sizes using the Mahalanobis distance; and combining the
resulting clusters into superclusters with the DBSCAN method, choosing its
hyperparameter (maximum distance) so that the introduced matrix quality
criterion is maximal at the maximum number of superclusters. The matrix
quality criterion is the proportion of statistically significantly separated
superclusters among all superclusters found.
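The three stages can be illustrated with a minimal, standard-library-only
sketch. It is not the paper's implementation: the function names, the 1-D
(mean, std) representation of a cluster, the symmetrised distance, and the
fixed `max_distance` are all assumptions made for illustration. The paper
selects the maximum distance by maximizing its matrix quality criterion, which
is not reproduced here. The grouping step uses the fact that DBSCAN with a
minimum neighbourhood size of 1 reduces to finding connected components of the
"closer than max_distance" graph.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def pick_n_clusters(fits, n_samples):
    """Stage 1 (sketch): choose the cluster count with the minimum BIC.

    `fits` maps a candidate cluster count to (log_likelihood, n_params)
    of an already fitted mixture model (hypothetical input format).
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_samples))


def mahalanobis_1d(x, mean, std):
    """Mahalanobis distance of a point from a 1-D Gaussian."""
    return abs(x - mean) / std


def cluster_distance(a, b):
    """Stage 2 (sketch): distance between two 1-D clusters (mean, std).

    Assumption: the smaller of the two one-sided Mahalanobis distances
    between cluster centres is used; the paper may define this differently.
    """
    (mean_a, std_a), (mean_b, std_b) = a, b
    return min(mahalanobis_1d(mean_b, mean_a, std_a),
               mahalanobis_1d(mean_a, mean_b, std_b))


def group_clusters(clusters, max_distance):
    """Stage 3 (sketch): merge clusters linked by distance <= max_distance.

    Equivalent to DBSCAN with min_samples=1: every cluster is a core
    point, so superclusters are connected components.
    """
    labels = [-1] * len(clusters)
    current = 0
    for start in range(len(clusters)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first walk over the neighbourhood graph
            i = stack.pop()
            for j in range(len(clusters)):
                if labels[j] == -1 and \
                        cluster_distance(clusters[i], clusters[j]) <= max_distance:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


if __name__ == "__main__":
    # Made-up log-likelihoods and parameter counts for 1, 2 and 3 clusters.
    fits = {1: (-500.0, 2), 2: (-420.0, 5), 3: (-418.0, 8)}
    print(pick_n_clusters(fits, n_samples=200))

    # Four made-up clusters forming two well-separated pairs.
    clusters = [(0.0, 1.0), (1.5, 1.0), (10.0, 1.0), (11.0, 1.0)]
    print(group_clusters(clusters, max_distance=2.0))
```

In this toy setting a small `max_distance` leaves every cluster as its own
supercluster and a large one merges everything, which is why the choice of
that hyperparameter needs a separate quality criterion.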
The algorithm has only one hyperparameter, the statistical significance
level, and automatically detects the optimal number and shape of
superclusters using a statistical hypothesis testing approach. The algorithm
demonstrates good results on test datasets in both noisy and noiseless
situations. An essential advantage of the algorithm is its ability to predict
the correct supercluster for new data with an already trained clusterer and
to perform soft (fuzzy) clustering. The disadvantages of the algorithm are
its low speed and the stochastic nature of the final clustering. It requires
a sufficiently large dataset for clustering, which is typical for many
statistical methods.
Comment: 32 pages, 7 figures, 1 table
Improving the spatial resolution by effective subtraction technique at Irkutsk incoherent scatter radar: the theory and experiment
We describe a sounding technique that improves the spatial resolution of the
Irkutsk Incoherent Scatter Radar without losing spectral resolution. The
technique is based on transmitting rectangular pulses of different durations
in different sounding runs and subtracting the correlation matrices. We have
shown theoretically and experimentally that subtracting the mean-square
parameters of the scattered signal obtained for the different sounding
signals from one another solves the problem within the approximation of
quasi-static ionospheric parameters.
Comment: 4 pages, 3 figures, to appear at URSI-2011 conference